Back to our maxent model for monticola…

library(ENMTools)
## Loading required package: raster
## Loading required package: sp
## Loading required package: dismo
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:raster':
## 
##     intersect, select, union
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(ggplot2)
data("iberolacerta.clade")
data("euro.worldclim")

monticola <- iberolacerta.clade$species$monticola
monticola.mx <- enmtools.maxent(monticola, euro.worldclim,
                                test.prop = 0.3)
## 
## 
## No background points provided, drawing background from range raster.
## 
## 
## Drawing background from species background points.
## 
## 
## 
## 
## Drawing background from species background points.
interactive.plot.enmtools.model(monticola.mx)
## Warning in wkt(projfrom): CRS object has no comment
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
## ellps WGS 84 in CRS definition: +proj=merc +a=6378137 +b=6378137 +lat_ts=0
## +lon_0=0 +x_0=0 +y_0=0 +k=1 +units=m +nadgrids=@null +wktext +no_defs
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
## datum WGS_1984 in CRS definition
## Warning in wkt(pfrom): CRS object has no comment
## Warning in rgdal::rawTransform(projfrom, projto, nrow(xy), xy[, 1], xy[, : Using
## PROJ not WKT2 strings
## Warning in wkt(pfrom): CRS object has no comment
## Warning in rgdal::rawTransform(projfrom, projto, nrow(xy), xy[, 1], xy[, : Using
## PROJ not WKT2 strings
## Warning in rgdal::rawTransform(projto_int, projfrom, nrow(xy), xy[, 1], : Using
## PROJ not WKT2 strings
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
## ellps WGS 84 in CRS definition: +proj=merc +a=6378137 +b=6378137 +lat_ts=0
## +lon_0=0 +x_0=0 +y_0=0 +k=1 +units=m +nadgrids=@null +wktext +no_defs
## Warning in showSRID(uprojargs, format = "PROJ", multiline = "NO"): Discarded
## datum WGS_1984 in CRS definition
## Warning in wkt(pfrom): CRS object has no comment
## Warning in rgdal::rawTransform(projfrom, projto, nrow(xy), xy[, 1], xy[, : Using
## PROJ not WKT2 strings

Model evaluation in ENMTools

If you look into the model object, you will find “training.evaluation” and “test.evaluation”. These are dismo evaluation objects containing a lot of information about your model.

monticola.mx$training.evaluation
## class          : ModelEvaluation 
## n presences    : 182 
## n absences     : 512 
## AUC            : 0.8597935 
## cor            : 0.5651833 
## max TPR+TNR at : 0.4760822
monticola.mx$test.evaluation
## class          : ModelEvaluation 
## n presences    : 78 
## n absences     : 512 
## AUC            : 0.8289263 
## cor            : 0.4232471 
## max TPR+TNR at : 0.4760822

By default the summary gives you the AUC, correlation, and the threshold value that maximizes the sum of true positive and true negative predictions. However, there’s a lot more information in these objects than that. You can find those by going to the help page for the evaluate function in dismo. To access these elements you’ll need to use the @ symbol, because ModelEvaluation objects S4 objects. You can also use the plotting functions from dismo to visualize these metrics.

help(evaluate)
## Help on topic 'evaluate' was found in the following packages:
## 
##   Package               Library
##   generics              /Library/Frameworks/R.framework/Versions/4.0/Resources/library
##   evaluate              /Library/Frameworks/R.framework/Versions/4.0/Resources/library
##   dismo                 /Library/Frameworks/R.framework/Versions/4.0/Resources/library
## 
## 
## Using the first match ...
monticola.mx$training.evaluation@kappa
##   [1] 0.000000000 0.001025358 0.002052624 0.003081803 0.004112901 0.005145924
##   [7] 0.006180876 0.006180876 0.007217762 0.009297362 0.010340087 0.011384768
##  [13] 0.012431411 0.013480022 0.014530607 0.015583170 0.015583170 0.017694257
##  [19] 0.018752791 0.018752791 0.020875869 0.021940424 0.023006999 0.024075597
##  [25] 0.025146226 0.026218890 0.026218890 0.028370350 0.029449157 0.030530024
##  [31] 0.031612956 0.031612956 0.033785038 0.034874201 0.035965453 0.035965453
##  [37] 0.038154248 0.039251803 0.040351471 0.041453259 0.042557172 0.042557172
##  [43] 0.044771399 0.045881725 0.046994202 0.048108835 0.049225631 0.050344596
##  [49] 0.051465737 0.052589059 0.053714569 0.053714569 0.055972180 0.055972180
##  [55] 0.058238620 0.059375167 0.060513941 0.061654949 0.062798196 0.063943690
##  [61] 0.065091438 0.066241445 0.067393719 0.068548267 0.066707001 0.067864295
##  [67] 0.069023882 0.070185768 0.071349960 0.072516465 0.072516465 0.072516465
##  [73] 0.076029924 0.077205749 0.077205749 0.079564449 0.080747337 0.081932595
##  [79] 0.083120228 0.084310244 0.085502651 0.086697455 0.087894663 0.089094283
##  [85] 0.090296323 0.091500789 0.091500789 0.093917030 0.095128820 0.096343066
##  [91] 0.097559776 0.097559776 0.100000616 0.101224762 0.102451401 0.103680542
##  [97] 0.104912192 0.106146358 0.107383050 0.108622273 0.108622273 0.111108348
## [103] 0.112355215 0.113604645 0.114856647 0.116111229 0.117368398 0.118628162
## [109] 0.118628162 0.121155510 0.122423109 0.123693336 0.124966199 0.126241706
## [115] 0.127519866 0.128800686 0.130084176 0.131370343 0.132659195 0.133950742
## [121] 0.135244992 0.136541953 0.137841633 0.139144042 0.140449188 0.141757079
## [127] 0.143067724 0.144381132 0.145697312 0.147016273 0.148338022 0.149662570
## [133] 0.150989924 0.152320095 0.153653090 0.154988919 0.154988919 0.154988919
## [139] 0.159013501 0.159013501 0.161710892 0.163063916 0.164419838 0.165778667
## [145] 0.167140413 0.168505085 0.169872693 0.171243246 0.172616753 0.173993225
## [151] 0.175372670 0.176755099 0.178140521 0.179528945 0.180920382 0.182314842
## [157] 0.183712333 0.185112867 0.186516453 0.187923101 0.189332822 0.190745624
## [163] 0.192161519 0.193580516 0.195002626 0.196427858 0.197856224 0.199287734
## [169] 0.200722398 0.202160226 0.203601229 0.205045417 0.206492801 0.207943392
## [175] 0.209397200 0.210854236 0.212314511 0.212314511 0.215244820 0.216714877
## [181] 0.218188215 0.219664847 0.221144783 0.222628035 0.224114613 0.224114613
## [187] 0.227097794 0.228594419 0.230094416 0.231597797 0.233104571 0.231369026
## [193] 0.231369026 0.234395728 0.235914233 0.237436191 0.238961612 0.240490509
## [199] 0.240490509 0.243558779 0.245098175 0.246641095 0.248187550 0.249737554
## [205] 0.251291117 0.252848253 0.254408974 0.255973292 0.257541220 0.257541220
## [211] 0.257541220 0.259112769 0.263849277 0.265435442 0.267025292 0.268618840
## [217] 0.270216099 0.271817083 0.273421805 0.275030276 0.276642512 0.278258524
## [223] 0.279878326 0.279878326 0.283129354 0.284760607 0.286395704 0.288034659
## [229] 0.289677484 0.291324194 0.292974803 0.294629325 0.296287773 0.297950162
## [235] 0.299616505 0.301286816 0.302961111 0.302961111 0.306321706 0.304659706
## [241] 0.304659706 0.308040523 0.309737028 0.311437618 0.313142306 0.314851109
## [247] 0.316564040 0.318281115 0.320002349 0.321727756 0.323457353 0.323457353
## [253] 0.326929176 0.328671432 0.330417939 0.332168712 0.333923767 0.335683120
## [259] 0.337446786 0.339214782 0.340987123 0.342763825 0.344544905 0.346330378
## [265] 0.348120261 0.349914571 0.351713323 0.353516535 0.355324222 0.357136402
## [271] 0.358953091 0.360774307 0.359177742 0.361006070 0.362838976 0.364676477
## [277] 0.366518591 0.368365334 0.370216725 0.368627871 0.370486608 0.372350047
## [283] 0.374218204 0.376091099 0.377968749 0.379851172 0.381738386 0.383630410
## [289] 0.385527262 0.383955221 0.385859778 0.387769219 0.386194859 0.388112106
## [295] 0.390034297 0.391961450 0.393893584 0.395830719 0.397772874 0.396208143
## [301] 0.398158368 0.400113673 0.402074079 0.396971885 0.398938452 0.397358274
## [307] 0.399333115 0.401313162 0.403298435 0.405288954 0.407284742 0.409285818
## [313] 0.411292203 0.413303919 0.415320988 0.417343430 0.419371266 0.421404520
## [319] 0.423443212 0.425487363 0.427536997 0.429592136 0.424449910 0.426511978
## [325] 0.428579621 0.430652862 0.432731724 0.434816228 0.436906400 0.439002260
## [331] 0.441103833 0.443211143 0.445324212 0.440155018 0.442275502 0.444401821
## [337] 0.446534000 0.448672063 0.450816034 0.452965938 0.455121800 0.457283643
## [343] 0.459451494 0.454254887 0.456430622 0.458612444 0.460800380 0.462994456
## [349] 0.465194697 0.467401129 0.462172137 0.464386780 0.466607700 0.468834921
## [355] 0.471068473 0.473308381 0.475554673 0.475554673 0.480066517 0.482332125
## [361] 0.484604227 0.486882850 0.489168024 0.491459776 0.493758135 0.496063130
## [367] 0.494604032 0.496920279 0.499243251 0.501572978 0.501572978 0.498657141
## [373] 0.501003071 0.491865636 0.490363949 0.488853360 0.491216076 0.493585817
## [379] 0.492075826 0.494457465 0.492943301 0.495336957 0.497737804 0.500145874
## [385] 0.502561199 0.497119900 0.499545283 0.498023570 0.500461408 0.502906681
## [391] 0.505359424 0.507819670 0.498316373 0.500784724 0.503260697 0.505744326
## [397] 0.505744326 0.506707839 0.509212288 0.511724542 0.510198088 0.500500481
## [403] 0.503026715 0.505560921 0.508103138 0.510653402 0.513211753 0.507535311
## [409] 0.510105015 0.512682932 0.515269101 0.513712362 0.516312665 0.510568692
## [415] 0.508981565 0.503154964 0.505776230 0.499895636 0.502528772 0.492283332
## [421] 0.490598669 0.493246986 0.495904120 0.481083226 0.483746714 0.486419173
## [427] 0.489100650 0.482935802 0.485629581 0.488332526 0.486585479 0.489303926
## [433] 0.489303926 0.494768808 0.497515340 0.500271331 0.484973526 0.487736466
## [439] 0.485947685 0.484146367 0.473084244 0.471218235 0.469338962 0.467446284
## [445] 0.470251952 0.473067662 0.471174780 0.469268250 0.462570986 0.465413650
## [451] 0.453809830 0.446928744 0.449784488 0.452650774 0.445699372 0.438670121
## [457] 0.441553143 0.444446968 0.437344417 0.430161118 0.427993406 0.425809139
## [463] 0.428733740 0.431669551 0.434616638 0.432436853 0.430240195 0.433213081
## [469] 0.431006618 0.418290552 0.410696773 0.392250178 0.362398020 0.362398020
## [475] 0.334636552 0.337596227 0.323365497 0.320544726 0.317700339 0.308959073
## [481] 0.311939516 0.297126785 0.294121458 0.285039904 0.275840344 0.278833000
## [487] 0.278833000 0.234840132 0.237814820 0.227989256 0.230975268 0.233974708
## [493] 0.230514414 0.227022716 0.210313055 0.193294493 0.175958385 0.158295759
## [499] 0.154313731 0.143259360 0.146235563 0.135016545 0.123634441 0.119367309
## [505] 0.107730495 0.088473258 0.061221388 0.056489063 0.043959427 0.031237398
## [511] 0.018318511 0.005198160 0.000000000 0.000000000
plot(monticola.mx$test.evaluation, 'ROC')

plot(monticola.mx$test.evaluation, 'kappa')

boxplot(monticola.mx$test.evaluation)

density(monticola.mx$test.evaluation)

There’s been a lot of back-and-forth about these various metrics (e.g. here: http://onlinelibrary.wiley.com/doi/10.1111/j.1365-2664.2006.01214.x/full), but they’re pretty much all different flavors of “how well does your model tell presence data from absence or background data”. I’m not entirely convinced that the argument is all that important, though, given results like these from thousands of simulations:

Kappa vs. TSS AUC vs. TSS AUC vs. Kappa

You’ll notice that kappa is the least correlated with the other two, which is likely due to variation in prevalence across studies. It’s still pretty highly correlated, though. In summary: although there may be differences between these metrics that are important on some sort of absolute scale, they are unlikely to give you radically different answers when choosing between alternative models built using the same data. I’ll talk more about how we might make better use of these metrics in a little while.

In addition to the traditional methods of measuring model fit for ENM/SDM studies, ENMTools implements a new way of thinking about model fit. If we think of models as estimates of the species’ responses to a set of environmental predictors instead of as a formula to generate a map, we might want to evaluate how well our model fits our data in environment space, not just in geographic space. This presents some difficulties, however; environment spaces can have an arbitrary number of dimensions, have no set boundaries, and have both continuous and discrete axes.

However, we can get an approximation of the fit of our model in environment space by Latin hypercube sampling of the space of possible environmental combinations. ENMTools does this automatically when you build a model, evaluating the ability of your model to distinguish presence and absence (or pseudoabsence) points within the environment space defined by the minima and maxima of your predictors in the environmental layers you pass to it.

Compare the fit of some of the models above in environment space to their fit in geographic space.

monticola.gam <- enmtools.gam(species = monticola, env = euro.worldclim, f = pres ~ bio3 + bio8 + bio11, test.prop = 0.3, nback = 400)
## 
## 
## No background points provided, drawing background from range raster.
## Adding environmental data to species monticola
##  Processing presence points...
##  Processing background points...
## 
## 
## Drawing background from species background points.
## 
## 
## 
## 
## Drawing background from species background points.
monticola.gam$test.evaluation
## class          : ModelEvaluation 
## n presences    : 78 
## n absences     : 400 
## AUC            : 0.8234135 
## cor            : 0.3880281 
## max TPR+TNR at : 0.6743041
monticola.gam$env.test.evaluation
## class          : ModelEvaluation 
## n presences    : 78 
## n absences     : 10000 
## AUC            : 0.644991 
## cor            : 0.05074645 
## max TPR+TNR at : 0.4439556

Spatially structured model evaluation

Let’s look again at our model’s training and test data.

plot(monticola.gam)

Notice how the geographic range of the training and test data is quite similar? Because of that similarity, and because the environment is spatially autocorrelated, our training and test data will often tend to be quite similar in the set of environments they encompass. This creates a situation where it’s often relatively easy for a model to achieve good performance on randomly withheld test data. In contrast, if we’re using our models to predict to different times and spaces, the projections we make can be made into quite different areas of environment space. When this is the case, the performance of our model on randomly withheld test data may be a poor indicator of how well it might extrapolate to new conditions.

For this reason, some investigators have proposed that we instead evaluate our models on spatially structured subsets of our data. We are currently incorporating this feature into ENMTools; it is working now, but we plan to expand on this functionality signficiantly. At present we use the block partitioning functionality of ENMeval. This procedure partitions the data into four blocks by splitting the data in half horizontally and vertically, and then holds out the data from one of those blocks for testing purposes. We can do this with our ENMTools models just by passing “block” to the test.prop argument.

monticola.gam <- enmtools.gam(species = monticola, env = euro.worldclim, f = pres ~ bio3 + bio8 + bio11, test.prop = "block", nback = 400)
## 
## 
## No background points provided, drawing background from range raster.
## Adding environmental data to species monticola
##  Processing presence points...
##  Processing background points...
## 
## 
## Drawing background from species background points.
## 
## 
## 
## 
## Drawing background from species background points.
monticola.gam
## 
## 
## Formula:  presence ~ bio3 + bio8 + bio11
## <environment: 0x7fbb37c20950>
## 
## 
## Data table (top ten lines): 
## 
## |   | Longitude| Latitude| bio1| bio2| bio3| bio4| bio5| bio6| bio7| bio8| bio9| bio10| bio11| bio12| bio13| bio14| bio15| bio16| bio17| bio18| bio19| presence|
## |:--|---------:|--------:|----:|----:|----:|----:|----:|----:|----:|----:|----:|-----:|-----:|-----:|-----:|-----:|-----:|-----:|-----:|-----:|-----:|--------:|
## |1  | -5.171215| 43.06957|   78|  100|   40| 5056|  223|  -26|  249|   52|  145|   146|    18|   917|   113|    48|    24|   305|   169|   171|   251|        1|
## |2  | -6.036635| 43.02531|   76|  100|   40| 5052|  220|  -26|  246|   48|  143|   144|    16|  1012|   128|    46|    28|   345|   166|   171|   302|        1|
## |3  | -7.679727| 40.38852|  137|   94|   38| 5318|  281|   34|  247|   72|  208|   208|    72|  1143|   172|    11|    55|   465|    72|    72|   465|        1|
## |4  | -7.790437| 40.30959|  129|   88|   36| 5333|  271|   29|  242|   65|  200|   200|    65|  1231|   187|    12|    56|   504|    76|    76|   504|        1|
## |6  | -6.575039| 42.91070|   84|   99|   40| 5134|  228|  -19|  247|   28|  152|   152|    23|  1012|   129|    39|    32|   358|   145|   145|   328|        1|
## |7  | -5.132756| 43.49572|  133|   78|   41| 3998|  235|   45|  190|  115|  184|   186|    84|   822|   110|    40|    28|   284|   145|   160|   213|        1|
## |8  | -7.787378| 40.39362|  137|   94|   38| 5318|  281|   34|  247|   72|  208|   208|    72|  1143|   172|    11|    55|   465|    72|    72|   465|        1|
## |9  | -4.941888| 43.35310|  128|   78|   40| 4073|  233|   39|  194|  111|  180|   183|    78|   843|   111|    41|    27|   291|   149|   164|   222|        1|
## |10 | -7.621731| 40.34170|  101|   76|   33| 5359|  236|    7|  229|   39|  173|   173|    39|  1514|   234|    15|    57|   621|    92|    92|   621|        1|
## |11 | -7.645674| 40.36543|  101|   76|   33| 5359|  236|    7|  229|   39|  173|   173|    39|  1514|   234|    15|    57|   621|    92|    92|   621|        1|
## 
## 
## Model:  
## Family: binomial 
## Link function: logit 
## 
## Formula:
## presence ~ bio3 + bio8 + bio11
## 
## Parametric coefficients:
##               Estimate Std. Error z value Pr(>|z|)
## (Intercept)  0.6425901  2.3879684   0.269    0.788
## bio3        -0.0179853  0.0619595  -0.290    0.772
## bio8         0.0007854  0.0048622   0.162    0.872
## bio11        0.0004189  0.0063015   0.066    0.947
## 
## 
## R-sq.(adj) =  -0.00758   Deviance explained = 0.0238%
## -REML = 282.34  Scale est. = 1         n = 383
## 
## 
## Model fit (training data):  class          : ModelEvaluation 
## n presences    : 195 
## n absences     : 188 
## AUC            : 0.5061102 
## cor            : 0.0181502 
## max TPR+TNR at : 0.03881896 
## 
## 
## Environment space model fit (training data):  class          : ModelEvaluation 
## n presences    : 195 
## n absences     : 10000 
## AUC            : 0.2980159 
## cor            : -0.08596339 
## max TPR+TNR at : 0.4887027 
## 
## 
## Proportion of data wittheld for model fitting:  block
## 
## Model fit (test data):  class          : ModelEvaluation 
## n presences    : 65 
## n absences     : 16 
## AUC            : 0.2192308 
## cor            : -0.5643963 
## max TPR+TNR at : -0.01589649 
## 
## 
## Environment space model fit (test data):  class          : ModelEvaluation 
## n presences    : 65 
## n absences     : 10000 
## AUC            : 0.2502338 
## cor            : -0.06230069 
## max TPR+TNR at : 0.4960391 
## 
## 
## Suitability:  
## class      : RasterLayer 
## dimensions : 54, 162, 8748  (nrow, ncol, ncell)
## resolution : 0.1666667, 0.1666667  (x, y)
## extent     : -10, 17, 39, 48  (xmin, xmax, ymin, ymax)
## crs        : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0 
## source     : memory
## names      : layer 
## values     : 0.4875973, 0.5790201  (min, max)
## 
## 
## 
## Notes:  
## NULL

monticola.gam$response.plots
## $bio3

## 
## $bio8

## 
## $bio11

visualize.enm(monticola.gam, euro.worldclim, layers = c("bio11", "bio8"), plot.test.data = TRUE)
## $background.plot
## Warning: Removed 396 rows containing missing values (geom_raster).

## 
## $suit.plot

Calibration

One alternative to discrimination metrics is to measure calibration. Rather than simply asking how well an idealized threshold applied to your model separates your data into presence and background points, calibration metrics measure how accurately your suitability scores estimate the true frequency of presences vs. background or absence points in a given set of grid cells.

Measuring calibration using ENMTools is simple; ENMTools model objects are set up to talk to the CalibratR package already.

monticola.gam <- enmtools.gam(species = monticola, env = euro.worldclim, f = pres ~ bio3 + bio8 + bio11, test.prop = 0.3, nback = 400)
## 
## 
## No background points provided, drawing background from range raster.
## Adding environmental data to species monticola
##  Processing presence points...
##  Processing background points...
## 
## 
## Drawing background from species background points.
## 
## 
## 
## 
## Drawing background from species background points.
mont.cal <- enmtools.calibrate(monticola.gam)
## Registered S3 methods overwritten by 'adehabitatMA':
##   method                       from
##   print.SpatialPixelsDataFrame sp  
##   print.SpatialPixels          sp
mont.cal

## Calibration metrics for uncalibrated model: 
## 
## |       ECE|       MCE| ECE.equal.width| MCE.equal.width| boyce.index|
## |---------:|---------:|---------------:|---------------:|-----------:|
## | 0.1174837| 0.0284488|       0.1144909|       0.0240359|       0.906|
## 
##  Hosmer and Lemeshow goodness of fit (GOF) test
## 
## data:  this.pa, pred.df$prob
## X-squared = 24.738, df = 9, p-value = 0.003275
## 
## NULL

The plot in the top left measures how correlated your predicted and observed probability of presence are, while the histograms show the distribution of suitability values at presence and absence or background sites. The metrics in the table in the lower left are various measures of calibration. The expected calibration error and maximum calibration error are essentially measures of how often one will make mistakes in predicting presence of the species either across all suitability bins (ECE) or in the bin with the maximum error rate (MCE). These metrics assume that the ration of presence to background points in your test data matches the true prevalence of the species across the projected study area, and if you don’t know that’s the case it’s probably not worth interpreting these scores too rigorously.

The continuous Boyce index is simply a Spearman rank correlation coefficient measuring the correlation between predicted and observed probabilities of presence. It is not sensitive to prevalence, and as such is probably the most useful metric for most cases.

In addition to measuring model calibration we can REcalibrate models. This is essentially a matter of fitting a post hoc model to our SDM that modifies its suitability scores to better estimate the frequency of observations in the test data. Again it’s worth noting that these methods assume that the presence/background ratio in your test data is similar to the prevalence of your species in the real world, so if you don’t know that’s the case these recalibrations should be taken with a grain of salt. Phillips and Elith came up with a method for readjusting these recalibrations for prevalence, but these are not currently implemented in ENMTools.

Be aware that the following code will take a few minutes; some of these methods are quite cpu-heavy. There’s also currently an issue with the parallel package that’s causing the CalibratR package to fail recalibration unless you paste some nonsense in here, so that’s what the first bit is here.

## WORKAROUND: https://github.com/rstudio/rstudio/issues/6692
## Revert to 'sequential' setup of PSOCK cluster in RStudio Console on macOS and R 4.0.0
if (Sys.getenv("RSTUDIO") == "1" && !nzchar(Sys.getenv("RSTUDIO_TERM")) && 
    Sys.info()["sysname"] == "Darwin" && getRversion() >= "4.0.0") {
  parallel:::setDefaultClusterOptions(setup_strategy = "sequential")
}

mont.cal <- enmtools.calibrate(monticola.gam, recalibrate = TRUE)
## Warning in scale_me(new, min, max): A new instance exceeded the min value of the calibration training model
##             and was set to 0 value
## Warning in scale_me(new, min, max): A new instance exceeded the max value of the calibration training model
##             and was set to 1 value

## Warning in scale_me(new, min, max): A new instance exceeded the max value of the calibration training model
##             and was set to 1 value
mont.cal
## $original

## 
## $scaled

## 
## $transformed

## 
## $hist_scaled

## 
## $hist_transformed

## 
## $BBQ_scaled_sel

## 
## $BBQ_scaled_avg

## 
## $BBQ_transformed_sel

## 
## $BBQ_transformed_avg

## 
## $GUESS_1

## 
## $GUESS_2

## 
## Calibration metrics for uncalibrated model: 
## 
## |       ECE|      MCE| ECE.equal.width| MCE.equal.width| boyce.index|
## |---------:|--------:|---------------:|---------------:|-----------:|
## | 0.1349731| 0.032403|       0.1245421|       0.0272329|       0.906|
## 
##  Hosmer and Lemeshow goodness of fit (GOF) test
## 
## data:  this.pa, pred.df$prob
## X-squared = 27.96, df = 9, p-value = 0.0009686
## 
## 
## 
## |                    |Recalibration       |       ECE|       MCE| ECE.equal.width| MCE.equal.width| boyce.index|
## |:-------------------|:-------------------|---------:|---------:|---------------:|---------------:|-----------:|
## |original            |original            | 0.1442400| 0.0378800|       0.1245400|       0.0272300|       0.906|
## |scaled              |scaled              | 0.1442400| 0.0378800|       0.1245400|       0.0272300|       0.908|
## |transformed         |transformed         | 0.2451263| 0.0417453|       0.1677677|       0.0408127|       0.919|
## |hist_scaled         |hist_scaled         | 0.1192110| 0.0236953|       0.1140817|       0.0234687|       0.527|
## |hist_transformed    |hist_transformed    | 0.1054237| 0.0260040|       0.0398260|       0.0166463|       1.000|
## |BBQ_scaled_sel      |BBQ_scaled_sel      | 0.0785767| 0.0174510|       0.0653230|       0.0164207|       0.500|
## |BBQ_scaled_avg      |BBQ_scaled_avg      | 0.0883040| 0.0175457|       0.0771333|       0.0164597|       0.868|
## |BBQ_transformed_sel |BBQ_transformed_sel | 0.0756710| 0.0165330|       0.0654190|       0.0165250|       0.500|
## |BBQ_transformed_avg |BBQ_transformed_avg | 0.0802150| 0.0171733|       0.0695297|       0.0167387|       0.890|
## |GUESS_1             |GUESS_1             | 0.1001463| 0.0236600|       0.1025243|       0.0443893|       0.841|
## |GUESS_2             |GUESS_2             | 0.1299960| 0.0287307|       0.1188947|       0.0411490|       0.827|
## NULL

We now have our model recalibrated a bunch of different ways! Let’s see what those maps look like. Fair warning: CalibratR returns a LOT of information, so these objects can be challenging to navigate. You may notice a disconnect between the apparent correlation in the plot in the top left and the Boyce index values in the table. The Boyce index is actually being calculated over different data; it’s calculated using 100 bins for the x axis and the predicted/expected ratio for the y axis. As such it tends to be unhappy about models for which a lot of those suitability “bins” are empty. We’re still working on better ways of presenting these different approaches.

We used the recalibrated models to make suitability rasters, though, which you can find under mont.cal$calibrated.suitabilities.

plot(mont.cal$calibrated.suitabilities$BBQ_scaled_avg)
points(monticola$presence.points, pch = 16)

Statistical significance of ENM predictions

Spatial autocorrelation makes it somewhat difficult to know how to actually assess these metrics for ENMs. Many of the null expectations we have for uninformative models don’t include various aspects of our data or predictors, and as such could tend to underestimate the expected performance of an uninformative model.

Bahn and McGill test

Bahn and McGill suggested that, since environmental variables are spatially autocorrelated, the performance of a model should be compared to the performance of a model built using the same points on spatial predictors that are spatially autocorrelated but not biologically relevant. They created a set of fake predictors for “eastness” and “northness”, and measured how well models fit occurrence points using these predictors.

lat <- lon <- euro.worldclim[[1]]
xy <- coordinates(euro.worldclim[[1]])
lon[] <- xy[, 1]
lat[] <- xy[, 2]

fake.env <- stack(c(lon, lat))
names(fake.env) <- c("fakelon", "fakelat")
plot(fake.env)

fake.env <- mask(fake.env, euro.worldclim[[1]])
fake.env <- setMinMax(fake.env)
plot(fake.env)

fake.gam <- enmtools.gam(monticola, fake.env, test.prop = 0.3)
## 
## 
## No background points provided, drawing background from range raster.
## Adding environmental data to species monticola
##  Processing presence points...
##  Processing background points...
## 
## 
## Drawing background from species background points.
## 
## 
## 
## 
## Drawing background from species background points.
fake.gam
## 
## 
## Formula:  presence ~ s(fakelon, k = 4) + s(fakelat, k = 4)
## <environment: 0x7fbb1b683d60>
## 
## 
## Data table (top ten lines): 
## 
## |   | Longitude| Latitude|   fakelon|  fakelat| presence|
## |:--|---------:|--------:|---------:|--------:|--------:|
## |1  | -5.171215| 43.06957| -5.250000| 43.08333|        1|
## |2  | -6.036635| 43.02531| -6.083333| 43.08333|        1|
## |3  | -7.679727| 40.38852| -7.750000| 40.41667|        1|
## |5  | -7.473340| 43.78935| -7.416667| 43.75000|        1|
## |8  | -7.787378| 40.39362| -7.750000| 40.41667|        1|
## |9  | -4.941888| 43.35310| -4.916667| 43.41667|        1|
## |10 | -7.621731| 40.34170| -7.583333| 40.41667|        1|
## |11 | -7.645674| 40.36543| -7.583333| 40.41667|        1|
## |12 | -7.642539| 40.36317| -7.583333| 40.41667|        1|
## |15 | -7.100000| 42.93000| -7.083333| 42.91667|        1|
## 
## 
## Model:  
## Family: binomial 
## Link function: logit 
## 
## Formula:
## presence ~ s(fakelon, k = 4) + s(fakelat, k = 4)
## 
## Parametric coefficients:
##             Estimate Std. Error z value Pr(>|z|)  
## (Intercept)  -0.2658     0.1271  -2.091   0.0365 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Approximate significance of smooth terms:
##              edf Ref.df Chi.sq  p-value    
## s(fakelon) 1.867      3  17.58 8.05e-05 ***
## s(fakelat) 1.491      3  34.75 1.05e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## R-sq.(adj) =  0.188   Deviance explained =   15%
## -REML = 222.26  Scale est. = 1         n = 694
## 
## 
## Model fit (training data):  class          : ModelEvaluation 
## n presences    : 182 
## n absences     : 512 
## AUC            : 0.7574047 
## cor            : 0.3672156 
## max TPR+TNR at : 0.4364892 
## 
## 
## Environment space model fit (training data):  class          : ModelEvaluation 
## n presences    : 182 
## n absences     : 10000 
## AUC            : 0.8587363 
## cor            : 0.1924005 
## max TPR+TNR at : 0.4724793 
## 
## 
## Proportion of data wittheld for model fitting:  0.3
## 
## Model fit (test data):  class          : ModelEvaluation 
## n presences    : 78 
## n absences     : 512 
## AUC            : 0.773763 
## cor            : 0.3042442 
## max TPR+TNR at : 0.1009525 
## 
## 
## Environment space model fit (test data):  class          : ModelEvaluation 
## n presences    : 78 
## n absences     : 10000 
## AUC            : 0.8764064 
## cor            : 0.1336883 
## max TPR+TNR at : 0.5251417 
## 
## 
## Suitability:  
## class      : RasterLayer 
## dimensions : 54, 162, 8748  (nrow, ncol, ncell)
## resolution : 0.1666667, 0.1666667  (x, y)
## extent     : -10, 17, 39, 48  (xmin, xmax, ymin, ymax)
## crs        : +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0 
## source     : memory
## names      : layer 
## values     : 4.526567e-07, 0.9901261  (min, max)
## 
## 
## 
## Notes:  
## [1] "No formula was provided, so a GAM formula was built automatically"

What do you think about the results when compared with your empirical GAM? What are your thoughts on the null hypothesis this model represents?

Monte Carlo tests of model fit

The Bahn and McGill test sets a baseline expected performance based on a specific model of spatial autocorrelation, but our empirical predictors may have different levels of autocorrelation as well as multicollinearity. How can we incorporate our empirical modeling conditions into our null expectation? In other words, how well CAN a model be expected to fit the data, given the local level of environmental heterogeneity and collinearity, and the spatial autocorrelation present in the occurrence data?

Even if we can factor these in to get the expected performance of an uninformative model, the average behaviour for an uninformative model is only one part of the answer. Without error bars around that expectation, we have no idea whether a particular model is reliable or not. For that reason, Raes and ter Steege came up with a Monte Carlo randomization test based on generating replicate models using uninformative point data. This method keeps the predictors and study region from our empirical model, but generates models using uninformative locality data.

Let’s build a single uninformative model so we can see what that looks like.

fake.species <- monticola

# We're just going to sample some random points from the background and call them presence points
fake.species$presence.points <- as.data.frame(randomPoints(fake.species$range, nrow(fake.species$presence.points)))

fake.gam <- enmtools.gam(species = fake.species, env = euro.worldclim, f = pres ~ bio3 + bio8, test.prop = 0.3, nback = 400)
## 
## 
## No background points provided, drawing background from range raster.
## Warning in randomPoints(species$range, nback, species$presence.points):
## generated random points = 0.63 times requested number
## Adding environmental data to species monticola
##  Processing presence points...
##  Processing background points...
## 
## 
## Drawing background from species background points.
## 
## 
## 
## 
## Drawing background from species background points.
plot(fake.gam)

visualize.enm(fake.gam, euro.worldclim, layers = c("bio3", "bio8"))
## $background.plot
## Warning: Removed 396 rows containing missing values (geom_raster).

## 
## $suit.plot

Notice how there is spatial and environmental structure to our model even though the data itself carries no biological information? This is the behaviour that the Raes and ter Steege tests are intended to correct for.

These tests are now implemented in the ENMTools modeling functions, and are accessed using the “rts.reps” argument.

monticola.gam <- enmtools.gam(species = monticola, env = euro.worldclim, f = pres ~ bio3 + bio8, test.prop = 0.3, nback = 400, rts.reps = 100)

monticola.gam$rts.test$rts.pvalues
## $rts.geog.training.pvalue
## [1] 0.02
## 
## $rts.env.training.pvalue
## [1] 0.05
## 
## $rts.geog.test.pvalue
## [1] 0
## 
## $rts.env.test.pvalue
## [1] 0.14
monticola.gam$rts.test$rts.plots
## $geog.training.plot

## 
## $env.training.plot

## 
## $geog.test.plot

## 
## $env.test.plot

monticola.glm <- enmtools.glm(species = monticola, env = euro.worldclim, f = pres ~ poly(bio3, 4) + poly(bio8, 4), test.prop = 0.3, nback = 400, rts.reps = 100)

monticola.glm$rts.test$rts.pvalues
## $rts.geog.training.pvalue
## [1] 0
## 
## $rts.env.training.pvalue
## [1] 0.04
## 
## $rts.geog.test.pvalue
## [1] 0
## 
## $rts.env.test.pvalue
## [1] 0.04
monticola.glm$rts.test$rts.plots
## $geog.training.plot

## 
## $env.training.plot

## 
## $geog.test.plot

## 
## $env.test.plot

visualize.enm(monticola.glm, euro.worldclim, layers = c("bio1", "bio8"), plot.test.data = TRUE)
## $background.plot

## 
## $suit.plot

plot(monticola.glm)

It’s worth noting that the Raes and ter Steege method was initially proposed to measure significance of fit on training data only - all of the data would be used for fitting the model, and the fit of the model to that data would be evaluated using the Monte Carlo test. ENMTools’ implementation allows you to also evaluate model fit to test data by randomly resampling training and test data from the training region for each replicate. If you want to do the “classic” test as R&tS proposed, just set test.prop to zero.

More recently Bohl et al. presented a method that modifies these by keeping the test data from the empirical data set and just randomly resampling training data. This test isn’t implemented in ENMTools yet, but we can fake it! The models for our rts reps with test.prop = 0.3 are exactly what we need, we just need to evaluate their fit to the empirical test data.

You’ll find the models from the replicates stashed deep in the monticola.gam object. For rep 1, for instance, the model is at monticola.gam\(rts.test\)rts.models\(rep.1\)model. To evaluate the fit of that model to our empirical test points you could just do this:

test.pres <- monticola.gam$test.data
test.bg <- monticola.gam$analysis.df %>%
  filter(presence == 0) %>%
  select(Longitude, Latitude)

dismo::evaluate(test.pres, test.bg,
                monticola.gam$rts.test$rts.models$rep.1$model,
                euro.worldclim)
## class          : ModelEvaluation 
## n presences    : 78 
## n absences     : 400 
## AUC            : 0.5119071 
## cor            : -0.01730136 
## max TPR+TNR at : -0.5639016

Okay, so can we do that to all of the rts.rep models at once?

test.pres <- monticola.gam$test.data
test.bg <- monticola.gam$analysis.df %>%
  filter(presence == 0) %>%
  select(Longitude, Latitude)


bohl.test <- function(thismodel){
  dismo::evaluate(test.pres, test.bg, thismodel, euro.worldclim)
}


null.dist <-sapply(monticola.gam$rts.test$rts.models, 
       FUN = function(x) bohl.test(x$model)@auc)
null.dist <- c(monticola.gam$test.evaluation@auc, null.dist)
names(null.dist)[1] <- "empirical"

qplot(null.dist, geom = "histogram", fill = "density", alpha = 0.5) +
        geom_vline(xintercept = null.dist["empirical"], linetype = "longdash") +
        xlim(0,1) + guides(fill = FALSE, alpha = FALSE) + xlab("AUC") +
        ggtitle(paste("Model performance in geographic space on test data")) +
        theme(plot.title = element_text(hjust = 0.5))
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## Warning: Removed 2 rows containing missing values (geom_bar).

So there are three different Monte Carlo tests to evaluate model performance against a null distribution. Can you say in words what each of those null distributions means? Or to put it another way, what are the conditions under which a model is expected to outperform each null distribution? Think about what a model would have to predict better than expected for each.

Warren et al. (in review) test for model bias

Okay, remember how we projected a model we had to a future climate scenario?

spain.future <- raster::getData('CMIP5', var = "bio", res = 10, rcp = 85, model = 'AC', year = 70)

spain.future <- crop(x = spain.future, y = euro.worldclim)
names(spain.future) <- names(euro.worldclim)

future.pred <- predict(monticola.gam, spain.future)
future.pred$suitability

Real quick let’s just measure the average change in habitat suitability predicted between the present and future climate scenario.

change.raster <- future.pred$raster - monticola.gam$suitability

plot(change.raster)

So mostly it looks like habitat quality is predicted to decline in the places where the species lives right now. Let’s just get the average change in habitat suitability.

mean.change <- mean(getValues(change.raster), na.rm = TRUE)

mean.change
## [1] -0.1083963

So on average there’s a decline in suitability scores of 0.1 predicted over the next 50 years. Is it possible that some of that is due to biases, though? For instance, if there were a ton of habitat in the future that represent climates for which there is no current analog, would that induce some bias in the changes we predict? Are there also biases produced by our choice of algorithm, given that some tend to extrapolate quite a bit and others not at all?

It turns out we can do something very similar to our Monte Carlo tests for model fit and use it to get estimates of the bias built into a given ENM/SDM study design. Basically we just take our empirical study design (algorithm, study area, and future climate scenario) and run randomly selected data points from the study area through it. Then we measure the same summary statistics on those uninformative models as we do on our real models (e.g., average suitability change above) and compare our empirical results to our null distribution!

Again this isn’t currently implemented as a test in ENMTools (the paper is still in review), but you can hand-roll it fairly easily. If we want to look at the gam, we can even just reuse our rts reps!

bias.test <- function(thismodel){
  this.change <- predict(spain.future, thismodel, type = "response") - monticola.gam$suitability
  this.mean.change <- mean(getValues(this.change), na.rm = TRUE)
  return(this.mean.change)
}


null.dist <-sapply(monticola.gam$rts.test$rts.models, 
       FUN = function(x) bias.test(x$model))
null.dist <- c(mean.change, null.dist)
names(null.dist)[1] <- "empirical"

qplot(null.dist, geom = "histogram", fill = "density", alpha = 0.5) +
        geom_vline(xintercept = null.dist["empirical"], linetype = "longdash") + 
  guides(fill = FALSE, alpha = FALSE) + xlab("Mean change") +
        ggtitle(paste("Change in average suitability of habitat")) +
        theme(plot.title = element_text(hjust = 0.5))
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

We find that our empirical model predicts much more of a decline in suitability than expected if our occurrence data were random. This strongly suggests that, whatever the reason the suitability is expected to decline, it is not simply a methodological artifact.